Abstract

Background: Studies of protein association with DNA on a genome-wide scale are possible through methods like
ChIP-Chip or ChIP-Seq. Massive problems with false positive signals in our own experiments motivated us to revise
the standard ChIP-Chip protocol. Analysis of chromosome-wide binding of the alternative sigma factor σ32 in
Escherichia coli with this new protocol resulted in detection of only a subset of the binding sites found in a previous
study by Wade and colleagues. We suggested that the remaining binding sites detected in the previous study
are likely to be false positives. In a recent article the Wade group claimed that our conclusion is wrong and that the
disputed sites are genuine σ32 binding sites. They further claimed that the non-detection of these sites in our study
was due to low data quality.

Results/discussion: We respond to the criticism of Wade and colleagues and discuss some general questions of
ChIP-based studies. We outline why the quality of our data is sufficient to derive meaningful results. Specific points
are: (i) the modifications we introduced into the standard ChIP-Chip protocol do not necessarily result in a low
dynamic range, (ii) correlation between ChIP-Chip replicates should not be calculated based on the whole data set
as done in transcript analysis, (iii) control experiments are essential for identifying false positives. We suggest
how ChIP-based methods could be further optimized and which alternative approaches can be used to
strengthen conclusions.

Conclusion: We appreciate the ongoing discussion about the ChIP-Chip method and hope that it helps other
scientists to analyze and interpret their results. The modifications we introduced into the ChIP-Chip protocol are a
first step towards reducing false positive signals but there is certainly potential for further optimization. The
discussion about the σ32 binding sites in question highlights the need for alternative approaches and further
investigation of appropriate methods for verification.

Background

In a recent article in this journal we described our experience with application of the
ChIP-Chip method [1]. Our focus was on the replication protein SeqA, which had been shown
to be specific for hemi-methylated GATC sequences [2]. To gain a deeper understanding of
the DNA binding of SeqA we applied a widely used standard ChIP-Chip protocol [3]. As proof
that the method works well in our hands we performed ChIP-Chip experiments with an
RNA polymerase (RNAP) antibody as published previously [4]. To our great surprise the
binding sites we detected for SeqA and RNAP were highly similar. This was entirely
unexpected because many SeqA-bound DNA regions detected in this experiment did not
contain many of the established GATC binding sequences. One possibility is that these
non-canonical protein-DNA interactions are genuine binding sites and therefore an
indication that our understanding of DNA-binding proteins is incomplete. We considered the
alternative possibility that our surprising results might be artifacts. The key experiment
to distinguish between these explanations was a ChIP-Chip using a ΔseqA strain with a SeqA
antibody. In this experiment we also detected binding signals, which unambiguously
demonstrated that the standard method enriches chromosomal regions unspecifically. The
unspecific signals could be caused by binding of non-target proteins by the antibody.
Indeed, the quality and type of antibody are critical for the quality of ChIP-based
methods [5,6]. However, the antibody turned out not to be the problem in this case.
Evaluation of the ChIP-Chip method led to the identification of four causes for these
false signals: i) non-unique sequences, ii) incomplete reversion of crosslinks,
iii) inappropriate retention of protein in spin columns and iv) insufficient RNase
treatment [1]. We established a modified ChIP-Chip protocol to minimize the effects of
these sources of false positive ChIP peaks and applied it using the SeqA antibody. The
SeqA binding pattern detected with this new protocol was radically different from that of
the standard protocol, with almost no overlap. This means that specific details of a
protocol changed the chromosomal binding pattern completely. The SeqA binding sites we
detected with our modified method were exclusively canonical binding sites, with binding
signals proportional to the number of GATC sites in the respective regions. Thus, in the
case of SeqA the non-canonical protein-DNA interactions identified with the standard
ChIP-Chip method are artifacts.
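The comparison of binding signals with GATC content can be illustrated with a minimal sketch that counts GATC motifs in a moving window. The window of 500 bp and step size of 100 bp follow the figure captions of this article; the function name and the treatment of the chromosome as a linear sequence are simplifying assumptions for illustration and do not reproduce the original analysis.

```python
def gatc_density(seq, window=500, step=100, motif="GATC"):
    """Count motif occurrences in sliding windows along a sequence.

    Assumption: the sequence is treated as linear, so windows do not
    wrap around the origin of a circular chromosome.
    Returns a list of (window_start, motif_count) tuples.
    """
    seq = seq.upper()
    # Start positions of every motif occurrence in the sequence.
    hits = [i for i in range(len(seq) - len(motif) + 1)
            if seq.startswith(motif, i)]
    densities = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        end = start + window
        # A motif counts only if it lies completely inside the window.
        count = sum(1 for h in hits if start <= h and h + len(motif) <= end)
        densities.append((start, count))
    return densities


# Toy sequence: three GATC sites at the start, followed by motif-free DNA.
toy = "GATC" * 3 + "A" * 988
profile = gatc_density(toy)
```

Plotting such a profile next to the enrichment signal is one simple way to check whether peaks follow the expected motif density.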

In 2006, Wade and colleagues published a ChIP-Chip study on the alternative sigma factor
σ32 [7]. In addition to 38 known binding sites they surprisingly found 49 new
non-canonical binding sites. These non-canonical sites could be either genuine binding
sites or artifacts. Wade et al. concluded that these sites are genuine σ32 binding sites.
Based on our experience with SeqA described above we considered the possibility that these
non-canonical sites might instead be false positives. This idea was supported by the lack
of a control ChIP-Chip experiment in the Wade et al. study and the fact that they refer to
the same protocol that gave the enormous false positive rate in our first SeqA attempt [1].
In our study, the ΔseqA control strain was crucial for identifying false positives. We
applied our modified ChIP-Chip protocol to analyze σ32 binding on the E. coli chromosome.
We detected almost all of the canonical σ32 binding sites but only very few of the
non-canonical sites. Taken together, these findings led to the conclusion that the majority
of non-canonical σ32 sites described by Wade et al. are probably not genuine binding sites
but instead false positives [1]. In a recent article in this journal Wade and colleagues
published a new study claiming that our conclusion was wrong and that the non-canonical
σ32 sites are in fact genuine binding sites [8]. They base their view on ChIP coupled with
qPCR analyses of 4 out of the 49 "disputed σ32 sites" (DSTs) and the claim that the quality
of our ChIP-Chip data is low compared to their study. In addition they find that the
specific ChIP enrichment is reduced by the more stringent steps we introduced into the
protocol. Here, we respond to the new study of the Wade group and use this to discuss some
critical questions surrounding ChIP-Chip analysis.

What is good data quality in ChIP-Chip studies?

As with most methods, the quality of ChIP-Chip derived data varies. This might be due to
the details of the methodology, the type and quality of the antibody, the biological
samples, as well as the performance and experience of the experimenter. Wade and
colleagues reanalyzed their own and our data regarding dynamic range and reproducibility
and concluded that both aspects were better in their study. We appreciate it when other
scientists re-analyze our data to come to their own conclusions. This is why we routinely
store our ChIP-Chip data in public databases such as the Gene Expression Omnibus (GEO).
The required detailed description of experimental procedures and data processing, together
with storage of raw as well as processed data, is essential for thorough follow-up
analysis. Thus, we recommend open access storage of genome-wide ChIP studies in general.
Unfortunately, the debated data of Wade and colleagues are not easily accessible. Wade and
colleagues might want to consider storage of their data in a public database to facilitate
data comparison and analysis by others. Below, we discuss questions related to the dynamic
range and reproducibility of ChIP-Chip derived data.

Dynamic range 

We accept the claim by Wade and colleagues that the dynamic range of their study is higher
than in our data set. However, the dynamic range is not a suitable quality measure for
inter-platform comparison. The main reason why the dynamic range in our study is lower is
that we used an improved microarray with a higher probe number and density. With such a
higher probe density the ChIP-DNA is distributed across a greater number of probes
(Figure 1). This would certainly decrease the dynamic range but at the same time greatly
increase data quality, because binding site detection can be assisted by comparisons
between multiple neighboring probe signals (Figure 1). Qi and colleagues systematically
tested the relationship between probe density and confidence in binding site detection and
came to the conclusion that "a single high density microarray (100-bp probe spacing)
provides better spatial resolution than three experimental replicates using lower density
arrays (300-bp probe spacing)" [9].

If the lower dynamic range of our ChIP-Chip data is the reason why we did not detect the
"disputed σ32 sites" (DSTs), then this would only apply to the targets with the lowest
values. However, the three DSTs with the highest ChIP-score in the Wade et al. paper (ytfl,
ygcl and yghj) were not detected in our study. At the same time we detected known targets
that showed a lower score in the Wade et al. study (for example grpE, yccV, hepA).

Figure 1 DNA binding to low and high probe-density microarrays. On low density
microarrays, enriched DNA fragments bind to a single probe (A). This leads to a single
strong signal. On high density microarrays the same DNA can bind to more than one specific
probe (B). This decreases the signal intensity but increases the number of signals to be
used for binding site detection. The dynamic range of raw data will thus be decreased
compared to low density microarrays but not the data quality.

Furthermore, the question remains whether our changes to the ChIP-Chip methodology
decrease the dynamic range as suggested by Wade and colleagues, and whether such a
decrease is relevant to this discussion over the identification of false positives. We
believe that our modifications of the ChIP-Chip protocol do not preclude an adequate
dynamic range. Support for this comes from a SeqA ChIP-Chip experiment with synchronized
E. coli cells [10]. Data were obtained from cells shortly after synchronous initiation of
replication (5 or 6 min) using both the standard protocol [11] and our modified version
[10]. As Wade and colleagues point out, the data from both protocols show similar results
(Figure 2). However, the dynamic range appears to be higher with our modified protocol
(98.6) than in the study with the standard protocol (5.1; Figure 2). The critical point
here is that the same antibody, the same E. coli strain and the same microarrays were used
for the experiments. For genome-wide analyses of SeqA binding in unsynchronized E. coli
cells, our modified method also gave higher dynamic ranges than the standard protocol
[1,11].

Reproducibility 

For reliable data, experiments need to be reproducible and the data from the replicates
should be comparable. For ChIP-Chip data, a straightforward analysis of reproducibility is
difficult. This is because most of the data on the microarray can be considered
background. Even with a protein of interest binding some hundred times, this will be only
a small fraction compared to the whole genome. Consequently, only some probes are expected
to give a relevant signal. The remaining probes will detect only background DNA. For
calculations of correlation coefficients, as done by Wade and colleagues, this means that
one mainly calculates the correlation of the background signal. Thus we consider the
information gain of this number limited.
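The argument above can be illustrated with a small simulation, given here as a hedged sketch rather than an analysis of either data set. All numbers (5,000 probes, 50 target probes, a signal offset of 3, the noise levels) are hypothetical choices. In the first case the targets are identical in both replicates but the background is independent noise; in the second case the replicates share no targets at all but share a probe-specific background bias.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n_probes=5000, n_targets=50, shared_background=False,
             shared_targets=True, seed=0):
    """Simulate two replicates (hypothetical parameters, illustration only)."""
    rng = random.Random(seed)
    rep1, rep2 = [], []
    for _ in range(n_probes):
        if shared_background:
            # Probe-specific bias seen in both replicates plus small noise.
            bias = rng.gauss(0, 1)
            rep1.append(bias + rng.gauss(0, 0.3))
            rep2.append(bias + rng.gauss(0, 0.3))
        else:
            # Background is independent noise in each replicate.
            rep1.append(rng.gauss(0, 1))
            rep2.append(rng.gauss(0, 1))
    targets1 = range(n_targets)
    # Either the same target probes or a completely different set.
    targets2 = targets1 if shared_targets else range(n_targets, 2 * n_targets)
    for i in targets1:
        rep1[i] += 3
    for i in targets2:
        rep2[i] += 3
    return rep1, rep2


# Identical targets, uncorrelated background: low whole-array correlation.
r_same_targets = pearson(*simulate(shared_background=False, shared_targets=True))
# No common targets, correlated background: high whole-array correlation.
r_no_common_targets = pearson(*simulate(shared_background=True, shared_targets=False))
```

In this toy setting the whole-array coefficient is higher for the replicates that share no targets, which is why a correlation computed over all probes says little about the reproducibility of binding sites.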

We incorporated reproducibility in our study by considering as relevant only those signals
that reached a certain threshold in both replicates, as was the case in the analysis by
Wade and colleagues. Since we have in this way detected almost all known and published σ32
targets in our data, we consider the reproducibility of our data to be solid.
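A minimal sketch of such a filter is shown below. The probe names, signal values and the threshold of 1.5 are hypothetical and serve only to illustrate the principle of requiring a signal in both replicates.

```python
def reproducible_hits(rep1, rep2, threshold):
    """Return probe IDs whose signal reaches the threshold in both replicates.

    rep1 and rep2 map probe IDs to enrichment values; probes missing from
    either replicate are ignored.
    """
    shared = rep1.keys() & rep2.keys()
    return sorted(p for p in shared
                  if rep1[p] >= threshold and rep2[p] >= threshold)


# Hypothetical enrichment values for three probes in two replicates.
replicate_1 = {"probe_a": 2.5, "probe_b": 0.4, "probe_c": 1.9}
replicate_2 = {"probe_a": 2.1, "probe_b": 3.0, "probe_c": 0.2}
hits = reproducible_hits(replicate_1, replicate_2, threshold=1.5)
```

Only `probe_a` passes, because the other two probes exceed the threshold in a single replicate.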

There are other ways to assess the reproducibility of ChIP-Chip and ChIP-Seq data. The
critical point is to focus on the target sites. This can be difficult if one lacks an
estimate of the expected number of targets. One way to deal with this is a stepwise
comparison of ranked target lists, computing the fraction of overlapping targets in the
highest 10, 20, 30, ... %. In our previous study we used the highest 1,000 probes (out of
40,000) to plot Venn diagrams for experiment comparison [1]. Such a quantitation of target
reproducibility helps the reader in data interpretation and should be provided if
possible.
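The stepwise comparison can be sketched as follows. This is an illustrative implementation under simple assumptions (scores are given per probe, higher means stronger enrichment, and the top fraction is rounded to whole probes); the function name and example values are hypothetical.

```python
def top_fraction_overlap(scores1, scores2, fractions=(0.1, 0.2, 0.3)):
    """Fraction of shared probes among the top-ranked probes of two replicates.

    scores1 and scores2 map probe IDs to enrichment scores. For each
    fraction f, the top f of probes is taken from each ranking and the
    share of probes present in both top lists is reported.
    """
    # Use only probes measured in both replicates; sort IDs for stable ties.
    probes = sorted(scores1.keys() & scores2.keys())
    ranked1 = sorted(probes, key=lambda p: scores1[p], reverse=True)
    ranked2 = sorted(probes, key=lambda p: scores2[p], reverse=True)
    result = {}
    for f in fractions:
        k = max(1, round(f * len(probes)))
        shared = set(ranked1[:k]) & set(ranked2[:k])
        result[f] = len(shared) / k
    return result


# Ten hypothetical probes; the two replicates disagree only on the top rank.
rep_a = {f"p{i}": 10 - i for i in range(10)}
rep_b = dict(rep_a)
rep_b["p0"] = 8.5  # p0 drops to second place in replicate B
overlap = top_fraction_overlap(rep_a, rep_b)
```

Here the top 10 % (one probe) do not overlap, while the top 20 % and 30 % agree completely, which shows how the overlap curve depends on the chosen cut-off.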

Control experiments 

While discussion about data quality is certainly important, it distracts from the main
point of our study. The erroneous data we obtained for SeqA using the standard protocol
had an excellent dynamic range, and reproducibility was high. In fact we obtained the
highest dynamic range with our control using a ΔseqA strain and the SeqA antibody.
However, all of the detected peaks in this experiment must be false. This is what we
consider the most dangerous property of the false positive peaks we identified: they
appear as wonderful, reproducible hits and not as noise (Figure 3). In our view this is
why such false positive enrichments could easily be accepted and published as true binding
sites. We have discussed the importance of control experiments as a critical step to
identify false positives [1]. Our control experiment for the ChIP-Chip detection of the
heat shock sigma factor σ32 in heat-shocked E. coli cells was a similar experiment using
non-heat-shocked cells. It is remarkable that Wade and colleagues did not include any
ChIP-Chip control experiment in their σ32 study.

Figure 2 Comparison of SeqA ChIP-Chip data of Sanchez-Romero et al. [11] (red) and
Waldminghaus et al. [10] (green). Both studies used E. coli dnaC2 mutant cells
synchronized regarding DNA replication 5 or 6 minutes after initiation and the same
antibody for the ChIP reaction. While Sanchez-Romero and colleagues used the original
ChIP-Chip protocol, Waldminghaus and colleagues used the modified protocol. While both
signal patterns correspond to the GATC density (blue) as expected for SeqA, the dynamic
range varies between 5.1 for the Sanchez-Romero et al. experiment and 98.6 for the
Waldminghaus et al. experiment. Grey dots mark peaks corresponding to a GATC density of
> 5 (moving window of 500 bp; step size 100 bp).

In a recent study, binding of LeuO to the Salmonella enterica genome was analyzed by
ChIP-Chip [12]. Dillon and colleagues found 261 binding sites using the ChIPOTle peak
finding program. However, they were aware of the possibility of false positives in
ChIP-Chip data and performed a mock control experiment. In this control, 83 peaks were
detected that overlapped with the 261 potential LeuO peaks. Dillon and colleagues
identified them as false positives and considered only the remaining 178 as likely LeuO
binding sites. The approach of Dillon and colleagues supports our argument that, firstly,
false positives are a serious problem in ChIP-Chip studies and secondly, control ChIP-Chip
experiments can help to detect and reduce false positives. This is also true for ChIP-Seq,
where it was shown that peak-scoring algorithms using 2-sample scoring (scoring sample vs.
control experiment) perform better than single-sample scoring ones [13].
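The principle of discarding peaks that also appear in a control (261 candidate peaks minus 83 overlapping control peaks leaves 178) can be sketched as a simple interval filter. This is not the procedure used by Dillon and colleagues or by any specific peak caller; the peak coordinates and function names are hypothetical.

```python
def overlaps(a, b):
    """True if two (start, end) intervals with inclusive ends share a position."""
    return a[0] <= b[1] and b[0] <= a[1]


def remove_control_peaks(sample_peaks, control_peaks):
    """Keep only sample peaks that do not overlap any peak in the control."""
    return [p for p in sample_peaks
            if not any(overlaps(p, c) for c in control_peaks)]


# Hypothetical peak coordinates (bp) from a sample and a mock control.
sample = [(100, 200), (500, 650), (900, 1000)]
control = [(180, 260), (2000, 2100)]
specific = remove_control_peaks(sample, control)
```

The first sample peak is dropped because it overlaps a control peak; the remaining two are retained as candidate binding sites.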

Are the new σ32 targets found by Wade and colleagues real targets or false positives?

Although Wade and colleagues did not include a ChIP-Chip control experiment in their
original study, they performed ChIP-qPCR experiments of 3 selected loci out of the 49
"disputed σ32 sites" [7]. Notably, here they included a non-heat shock control. In their
new study they analyze 3 more loci [8]. The six analyzed loci indeed show
temperature-dependent association with σ32. These results contradict our ChIP-Chip data,
where no significant temperature-dependent association at the respective loci was found.
Further experiments might help to resolve this contradiction. It is even more important to
analyze the 43 remaining DSTs for which no temperature-dependent change in σ32 binding has
been shown so far. We suggest that alternative methods are needed for verification.
Temperature-dependent induction of mRNAs at the respective regions could be considered
additional evidence but was not found for most "disputed σ32 sites" [7,14,15]. Sequences
resembling the well-characterized σ32 target promoter sequence in the debated regions
would also support them as genuine binding sites. However, Wade et al. note that for many
of the DSTs no such typical binding sequences could be found [7]. They suggest that at
these sites σ32 binding is mediated by transcriptional activators that are functional only
after heat shock. Identification of these predicted factors would certainly be important
for the discussion about DSTs. One possibility to find these factors would be mChIP, where
proteins co-purified in a ChIP reaction are analyzed [16].

Figure 3 False positive enrichment peaks resemble true binding sites in ChIP-Chip
experiments. Our modified ChIP-Chip protocol (green) resulted in enrichment data
corresponding to the density of GATC binding sites (blue; moving window of 500 bp; step
size 100 bp) as expected for SeqA binding, with low signals in low GATC regions (A) and
high signals in GATC dense chromosomal regions (B) [1]. The standard ChIP-Chip protocol
(red) resulted in erroneous enrichment peaks in regions with low GATC density (A) and low
signals in GATC rich regions (B). The shape and signal level of the false positive signal
example in the ytfl gene region (red in A) and the true positive in the caiC gene region
(green in B) are similar.

What would be an appropriate alternative method to clarify disputed ChIP sites? For
protein interactions, a popular approach is to complement one pull-down experiment with
the reverse pull-down, meaning both protein partners should be interchangeable as 'bait'
and 'prey'. For ChIP experiments the reverse approach would be to use the DNA as bait to
catch the proteins which are proposed to bind this motif. Such methods have actually been
developed [17,18].



Can the ChIP protocol still be improved? 

One thing that becomes clear from both our study and the current study by Wade and
colleagues is that the experimental details can change the output of ChIP-Chip experiments
dramatically [1,8]. We introduced some changes to the method that completely changed the
detected SeqA binding pattern towards what we believe to be a more reasonable result.
However, we also believe that there is still room for improvement. One main point under
consideration is the use of Spin-X columns for washing of the IP bound to Protein A
agarose beads. We had found that a problem is the unspecific binding of highly transcribed
and consequently highly crosslinked pieces of DNA to the column matrix. We suggest that
omission of such columns solves the problem that these unspecifically bound fragments are
washed off the column in the elution step and appear as peaks on the microarray. Instead
of using the columns for collection of the agarose beads we use simple centrifugation and
supernatant removal. Wade and colleagues make the point that the columns are necessary to
achieve thorough washing. Interestingly, they actually omit the columns in the first step
of the procedure, where the beads are separated from the cell extract [8]. While in the
original method description this is done using the Spin-X columns [7], Wade and colleagues
collect the beads by centrifugation without columns in this first step, just as we
suggested [8]. This first step is probably the point where most unspecific binding to the
column occurs, and the omission of columns in this step would be expected to greatly
reduce false signals. The following washing steps might be less critical in this respect
and the use of Spin-X columns possible or even beneficial. This is certainly a point for
further investigation. A related potential improvement is the choice of the actual column
to be used. The Spin-X column, for example, is available with various matrix materials and
pore sizes. We suspect that if unspecific binding to the column is a problem, then this
should vary with the pore size and DNA fragment size.

It is noteworthy that other aspects of ChIP-based methods need to be considered beyond
those covered by the current discussion. Most prominent is the computational part of the
process, which provides new challenges with the advent of ChIP-Seq [9,13]. This
computational aspect is certainly important for identifying false positives.

Conclusions 

ChIP-Chip and ChIP-Seq are wonderful methods to gain insights into protein binding to
genomes. We try to promote these methods by optimizing them and alerting other scientists
to potential difficulties in data generation and interpretation. We agree with Wade and
colleagues that surprising non-canonical protein-DNA interactions can "indicate novel
functions for well-studied proteins". Examples show that non-canonical binding sites can
indeed be functionally relevant [19,20]. However, we and many others have detected false
positives in ChIP-Chip experiments, and it is not unlikely that some false positives have
not been recognized as such but interpreted and published as real targets. Wade and
colleagues write in their conclusion that our view that surprising ChIP-Chip results are
often artifacts is a "dogmatic approach" [8]. Our conclusions in that regard were not
meant to be taken as dogmatic, but rather as a respectful caution against wasteful
scientific pursuit that could be based upon erroneous conclusions. The revision of methods
and criticism of published results of peers is not always appreciated, and neither is the
prospect of having one's own conclusions questioned. However, it is an essential part of
scientific progress. We hope that other scientists examine the results and arguments
published by the Wade group and ourselves and come to their own conclusions. For the
future, we are anticipating new results which we hope will help clarify the debated issues
surrounding the ChIP-Chip method.
Debate and criticism must be welcomed in any scientific endeavour;
however, such debate and criticism must also be based on solid
experimental evidence. While Schindler and Waldminghaus have responded
to our critique [14] of their earlier paper, we note that their response offers
no new experimental evidence or data analysis. Hence, our opinion is
unchanged: the disputed sites of σ32 binding (DSTs) are genuine, and
non-canonical sites identified by ChIP-chip or ChIP-seq are not artifacts to be
disregarded. We feel that three points require specific attention:

1. The modifications to the ChIP method proposed by Waldminghaus
and colleagues do not improve data quality. By directly comparing the
two methods in targeted, controlled ChIP assays, we have clearly
demonstrated that the standard ChIP method is more effective than the
modified method at detecting association of σ32 with both well-established
targets and DSTs. Additional experiments with the transcription factor AraC
confirm that the general outcome of ChIP experiments is unchanged by the
use of Spin-X columns during the wash steps; if anything, use of Spin-X
columns increases signal. We note that most ChIP studies do not use Spin-X
columns until the wash steps, and this modification to the method was
applied before the study of Waldminghaus and Skarstad [1], e.g. [21]. Instead
of representing a methodological improvement, we propose that
Waldminghaus and Skarstad's σ32 ChIP-chip experiments simply had reduced
sensitivity due to a combination of the ChIP protocol, antibody, and/or
growth conditions, all of which differed from our own.

Although we have focused on σ32, an independent study of SeqA, using the
standard ChIP-chip method, yielded almost identical data to those generated
by Waldminghaus and colleagues. This similarity was recently noted by a
group who successfully used the standard ChIP method to measure DNA
binding by SeqA [22]. The data presented by Schindler and Waldminghaus
in Figure 2A are misleading because the scales are not comparable for the
two datasets, and the time-points after replication initiation are different
(signal at the replication origin is expected to drop rapidly following
replication initiation).

2. Most, if not all, DSTs are genuine sites of σ32 association. Many lines of
evidence support this. First, we tested 6 DSTs using ChIP/qPCR, with an
appropriate control; all 6 targets were confirmed. Whilst the remaining 43
DSTs were not tested, our data strongly suggest that most are genuine sites
of σ32 association. We find it unrealistic to suggest that every single target
identified by a ChIP-chip or ChIP-seq experiment should be validated
individually. Second, while the standard ChIP protocol reliably yields higher
enrichment values at DSTs in targeted experiments, the modified method
also detects σ32 binding at 2 out of 4 DSTs tested. Moreover, the ranking of
DSTs as a group increases significantly among all genomic regions following
heat shock in Waldminghaus and colleagues' ChIP-chip data (Mann-Whitney
U Test, p = 1e-6), as would be expected for genuine sites of σ32 association.
Both of these findings support the idea that discrepancies between the
datasets are related to differences in assay sensitivity, not the validity of
targets. Third, nine of the DSTs were identified in independent
transcriptomic studies [14,15] and we have shown that RNA polymerase
levels increase at two of four DSTs tested following overexpression of σ32.
These two DSTs were not identified by either transcriptomic study,
indicating that such experiments are not necessarily sensitive enough to
detect all regulated RNAs. Fourth, DSTs are more likely to be located in
intergenic regions than expected by chance (Binomial Test, p = 0.00033).
Schindler and Waldminghaus suggest that DSTs are not real σ32 binding
sites because many of them lack detectable motifs and/or were not
detected in transcriptomic studies. However, since DSTs are generally weakly
bound, they would be expected to bind more degenerate motifs and be
associated with more subtle changes in transcription.

3. Non-canonical sites for DNA-binding proteins are often real, and
studying them is not a wasteful scientific pursuit. We reassert that the
headline conclusion of Waldminghaus and Skarstad's paper "ChIP on Chip:
surprising results are often artifacts", and their subsequent criticism of work
from multiple laboratories, is highly misleading. We and others have
identified many non-canonical targets for bacterial DNA-binding proteins
that have been validated in controlled experiments. In fact, 15 of the 47 σ32
targets identified by Waldminghaus and Skarstad are non-canonical (located
inside genes or not associated with detectable regulation in transcriptomic
studies) and, by their own logic, should be mistrusted. Although ChIP-chip
and ChIP-seq studies in bacteria have been limited in number, a recent
large-scale ChIP-seq study in Mycobacterium tuberculosis identified hundreds
of non-canonical transcription factor binding sites [23]. We conclude that
non-canonical binding sites for bacterial DNA-binding proteins occur often
and should be the subject of further study precisely because they do not
conform to the text-book model of transcription regulation.

ChIP-chip data from our original study [7] are now publicly available from
the EMBL-EBI ArrayExpress database (E-MTAB-1849). 